Abstract: In this paper we analyze different biometric 
authentication protocols considering an internal adversary. Our 
contribution takes place at two levels. On the one hand, we 
introduce a new comprehensive framework that encompasses 
the various schemes we want to look at. On the other hand, 
we exhibit actual attacks on recent schemes such as those 
introduced at ACISP 2007, ACISP 2008, and SPIE 2010, and 
some others. We follow a blackbox approach in which we 
consider components that perform operations on the biometric 
data they contain and where only the input/output behavior of 
these components is analyzed. 
I. Introduction 

ALTHOUGH biometric template protection is a relatively 
young discipline, already over a decade of research has 
brought many proposals. Methods to secure biometric data can 
be separated into three levels. The first one is to have biometric 
data coming in a self-protected form. Many algorithms have 
been proposed: quantization schemes [1], [2] for continuous 
biometrics; fuzzy extractors [ ] and other fuzzy schemes [4]-[6] 
for discrete biometrics; and cancellable biometrics [7]-[9]. 
The security of such template-level protection has been 
intensively analyzed, e.g., in [10]-[13]. On a second level one 
can use hardware to obtain secure systems, e.g., [14], [15]. 
Finally, at a third level advanced protocols can be developed 
to achieve biometric authentication protocols relying on ad- 
vanced cryptographic techniques such as Secure Multiparty 
Computation, homomorphic encryption or Private Information 
Retrieval protocols [16, Ch. 9] [17]-[24]. 

The focus of our work is on this third level. In this work, 
we analyze and attack some existing biometric authentication 
protocols. We follow [25] where an attack against a hardware- 
assisted secure architecture [ ] is described. The work of 
[25] introduces a blackbox model that is reused and 
extended here. In this blackbox model, internal adversaries 
are considered. These adversaries can interact with the system 
by using available input/output of the different functionalities. 
Moreover, the adversaries are malicious in the sense that they 
can deviate from the honest-but-curious classical behaviour, 
which is most often assumed. 

Our contributions are the following. We extend the blackbox 
framework initiated in [ ] with the distributed system model 
of [19] in a way that it can handle different existing proposals 
for biometric authentication. We show how this blackbox 
approach can lead to attacks against these proposals. We 
describe in detail our analysis of three existing protocols [19], 
[20], [22] and give arguments on some others [23], [24]. In the 
framework we propose, we study how the previous attacks can 
be formalized. We list all the possible attack points 
and the different internal entities that can lead the attacks, and 
we reveal the potential consequences. 

The rest of the paper is organized as follows. The framework 
is developed in Section II and introduces the system and attack 
model. This is then applied to existing protocols in Section III, 
where detailed attacks are described. Section IV formalizes 
these attacks and Section V concludes the paper. 

II. Framework 

In this section we present a framework that forms a basis 
for the security analysis of biometric authentication protocols. 
The framework models a generic distributed biometric system 
and the (internal) adversaries against such a system. We define 
the roles of the different entities that are involved and their 
potential attack goals. From these roles and attack goals we 
derive the requirements that are imposed on the data that are 
exchanged between the entities. 

Biometric Notation: Two measurements of the same 
biometric characteristic are never exactly the same. Because 
of this behavior, a biometric characteristic is modeled as a 
random variable $B$, with distribution $p_B$ over some range $\mathcal{B}$. 
A sample is denoted as b. Two samples or templates are related 
if they originate from the same characteristic. In practice, we 
will say they are related if their mutual distance is less than 
some threshold. Therefore, a distance function d is defined 
over $\mathcal{B}$, and for each value in the range of d that is used as 
the threshold when comparing two samples, a false match rate 
(FMR) and a false non-match rate (FNMR) can be derived. 
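
As an illustration of this notation, the following minimal Python sketch treats samples as bit lists, uses the Hamming distance as d, and estimates FMR and FNMR empirically for a given threshold. The function names and the convention that a distance equal to t still counts as a match are assumptions made for the example; they are not taken from the schemes analyzed later.

```python
def hamming(a, b):
    """Hamming distance d between two equal-length bit sequences."""
    if len(a) != len(b):
        raise ValueError("samples must have the same length")
    return sum(x != y for x, y in zip(a, b))


def related(sample, reference, t):
    """Declare two samples related when their distance is at most t.

    Assumption: distance == t is accepted (t is the maximum allowed distance).
    """
    return hamming(sample, reference) <= t


def error_rates(genuine_pairs, impostor_pairs, t):
    """Empirical (FMR, FNMR) at threshold t from labelled comparison pairs."""
    false_matches = sum(related(s, r, t) for s, r in impostor_pairs)
    false_non_matches = sum(not related(s, r, t) for s, r in genuine_pairs)
    return (false_matches / len(impostor_pairs),
            false_non_matches / len(genuine_pairs))
```

Sweeping t over the range of d and calling `error_rates` for each value gives the usual trade-off between the two error rates.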

Biometric variables can be continuous or discrete but in the 
remainder of the paper we will assume that they are discrete. 
Note that the variables may consist of multiple components. 
For example, a sample may consist of a bitstring, which is the 
quantized version of a feature vector, and another bitstring 
that indicates erasures or unreliable components in the first 
and thus acts as a mask. 

A. System Model 

Our system model follows to a large extent the model 
defined by Bringer et al. [19], which was also used to define 
new schemes in [20] and [26]. This model is motivated by 
a separation-of-duties principle: the different roles for data 
processing or data storage on a server are separated into three 
distinct entities. Distributing the roles is a baseline measure to 
prevent any single entity from controlling all information, and it is 
a realistic representation of how current biometric systems work in 
practice (cf. [27]). 

System Entities: The different entities involved in the 
system are a user Ui, a sensor S, an authentication server 
AS, a database VB and a matcher M. User Ui wishes to 
authenticate to a particular service and has, therefore, registered 
his biometric data bi during the enrollment procedure. 
In the context of the service the user has been assigned an 
identifier IDi, which only has meaning within this context. 
The biometric reference data bi are stored by VB, who links 
the data to identifier i. The mapping from IDi to i is only 
known by AS, if relevant. Note that in some applications it is 
possible that the same user is registered for the same service 
or in the same database with different samples, bi and bj, and 
different identities, i.e., IDi ≠ IDj in the service context or 
i ≠ j in the database context. The property of not being able 
to relate queries under these different identities is the identity 
privacy requirement as defined in [19]. 

During the authentication procedure the sensor S captures 
a fresh biometric sample b'i from user Ui and forwards 
the sample to AS. The authentication server AS manages 
authorizations and controls access to the service. To make 
the authorization decision, AS will rely on the result of the 
biometric verification or identification procedure that is carried 
out by the matcher M. It is assumed that there is no direct 
link between M and VB. As such, AS requests from VB 
the reference data that are needed by M and forwards them 
to M. It is further assumed that the system accepts only 
biometric credentials. This means that the user provides his 
biometric data and possibly his identity, but no user-specific 
key, password or token. Fig. 1 shows the participating entities. 

Functional Requirements: Enrollment often involves off- 
line procedures, like identity checks, and is typically carried 
out under supervision of a security officer. Therefore, we 
assume that users are enrolled properly and only authentication 
procedures are analyzed in our framework. A distinction has 
to be made between verification and identification. Verification 
introduces a selection step, which implies that VB returns 
only one of its references, namely the bi that corresponds to 
the identifier i that is used in the context of the database. 
The entity that does the mapping between IDi and i, when 
applicable, is generally AS. In identification mode, VB returns 
the entire set of references, in some protected form, to AS. 
The database can then be combined with b'i and forwarded to 
M. The matcher M has to verify that b'i matches with one 
or a limited number of bi in the received set of references or 
that one of the matching references has index i. 

We define the minimal logical functionality to be provided 
by our system entities in terms of generic information flows, 
which are included in our model in Fig. 1. In this functional 
model, we represent the result of the biometric comparison as a 
function of the distance d(b'i, bi). This is a generic representation 
of the actual comparison method: M can evaluate simple 
distances but also run more complex comparisons and will 
output either similarity measures or decisions that are based 
on some threshold t. The information flows are as follows. 

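User Ui presents a biometric characteristic Bi that will 
be sampled by the sensor S to produce a sample b'i. When 
operating in verification mode Ui will claim an identity IDi: 

$$U_i \xrightarrow{B_i} S \quad\text{or}\quad U_i \xrightarrow{B_i,\ ID_i} S. \tag{1}$$

The sensor S forwards b'i and IDi in some form to AS: 

$$S \xrightarrow{f_1(b'_i)} \mathrm{AS} \quad\text{or}\quad S \xrightarrow{f_1(b'_i),\ g_1(ID_i)} \mathrm{AS}. \tag{2}$$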

In general g1(IDi) = IDi but it can also be a mapping to 
an encrypted value to hide IDi from AS. If applicable AS 
resolves the mapping g1(IDi) to the identifier i and requests 
reference data for one or more users from VB by sending at 
least one request g2(b'i, i): 

$$\mathrm{AS} \xrightarrow{g_2(b'_i,\ i)} \mathrm{VB}. \tag{3}$$



Note that the function g2 does not necessarily use all the 
information in its arguments, e.g., the fresh sample b'i may 
be ignored. 

Database VB provides AS with reference data for one or 
more users in some form. It is possible that VB returns the 
entire database, e.g., in case of identification: 



$$\mathrm{AS} \xleftarrow{\{f_2(b_i)\}} \mathrm{VB}. \tag{4}$$



The authentication server AS forwards the fresh sample b'i 
and the reference data bi in some combined form to M: 

$$\mathrm{AS} \xrightarrow{f_3(b'_i,\ \{b_i\})} M. \tag{5}$$



Note that AS has only f1(b'i) and f2(bi) at his disposal to 
compute f3(b'i, {bi}). 

The matcher M performs a biometric comparison procedure 
on the received b'i and {bi} and returns the result to AS. The 
result may contain decisions or scores or different identities but 
should at least be based on one distance calculation between 
the fresh sample b'i and a reference bi: 

$$\mathrm{AS} \xleftarrow{f_4(d(b'_i,\ b_i))} M. \tag{6}$$



Different data are stored by the different entities. The 
database stores references {bi}. The authentication service 
stores the information needed to map g1(IDi) to i, if applicable. 
The matcher can store non-biometric verification data, 
e.g., hashes of keys extracted from biometrics, or decryption 
keys that are used to recover the result of combining sample and 
reference. Also, the sensor can store key material to encrypt 
the fresh sample. 

B. Adversary Model 

Attacker Classification: Based on the physical entry point 
of an attack a distinction is made between two types of 
attackers: internal attackers are corrupted components in the 
system and external attackers are entities that only have access 
to a communication channel. We will consider here only the 
issue of an insider attacker. As a baseline, we make the 
following assumption. 
Fig. 1. System model with indication of generic information flows and attack points Ai. User Ui's biometric is sampled by sensor S. The sample 
and Ui's identity are forwarded to the authentication server AS, who requests the corresponding reference bi from database VB. AS combines the sample 
and the reference and forwards the result to matcher M, who performs the actual comparison and returns the result to AS. The solid arrows represent the 
messages exchanged between the system entities. The dashed arrow represents the implicit feedback on the authentication result to the user Ui, i.e., access 
to the requested service is granted if the sample matches the reference. 



Assumption 1: The protocol ensures the security of the 
scheme against any external attacker. 

As this can be achieved with classical secure channel techniques, 
i.e., by an external security layer independent of the core protocol 
specification, we only study the internal layer further. Note that 
the security of the scheme needs to be expressed in terms of 
specific attack goals, which will be defined in the next section. 

A second distinction is made based on an attacker's capa- 
bilities. Passive attackers or honest-but-curious attackers are 
attackers that only eavesdrop on the communications in which 
they are involved and that can only observe the data that passes 
through them. They always follow the protocol specifications, 
never change messages and never generate additional commu- 
nication. Active or malicious attackers are internal components 
that can also modify existing or real transactions passing 
through them and that can generate additional messages. We 
mainly focus on malicious internal attackers and we formulate 
the following additional assumption. 

Assumption 2: The protocol ensures the security of the 
scheme against honest-but-curious entities, i.e., internal system 
components that always follow the protocol specifications but 
eavesdrop on internal communication. 

We will explain in Section II-C how this has a direct impact 
on the properties of the different functionalities of our model. 

Finally, we put aside the threats on the user or client side, by 
concentrating the analysis on the remote server's side, i.e., AS, 
VB or M. The information leakage for the user and the client 
is generally only the authentication or identification result. 
They can, however, try to gain knowledge on the reference 
data bi by running queries with different b'i, e.g., in some 
kind of hill climbing attack. The difficulty can vary widely 
depending on the modalities, the threshold and the scenario. 
A basic line of defense is to limit the number of requests, to 
ensure the liveness of the biometric inputs provided by the 
user and to hide the result when applicable. Although it is 
important to implement such defense mechanisms, the threats 
are inherent to any biometric authentication or identification 
system. So we do not take the user or the sensor into account 
as an attacker in this model and the primary attack points are 
AS, VB and M. Nonetheless, there may be inside attackers 
that also control the biometric inputs to some extent. We model 
this with a secondary attack point at the sensor. 

Assumption 3: The user Ui or the sensor S cannot be 
attackers on their own but they can act as a secondary attack 
point in combination with a primary attack point at AS, VB 
or M. If this is the case an attacker can choose the input sample 
b'i through S and observe whether the authentication request 
was successful through Ui. 

Of course, the baseline assumptions have to be checked 
before proceeding with a full analysis of the security of a 
scheme, but as such, they clarify what the big issues are that 
may remain in state-of-the-art schemes. They also underline 
what the hardest challenges are when designing a secure 
biometric authentication protocol. Fig. 1 sums up the different 
attack points we consider in our attack model. 

Attack Goals: As noted above, the security of a scheme 
is expressed in terms of specific attack goals or adversary 
objectives. Therefore, we define the following global attack 
goals. 

• Learn reference bi. In accordance with the security 
definitions in [25] we define different gradations in the 
information that an attacker may want to learn from bi. 
Minimum leakage refers to the minimum information that 
allows, e.g., linking of references with high probability. 
Authorization leakage is the information that is needed 
to construct a sample that is within distance t, the system 
threshold, of the reference bi. Full leakage gives full 
knowledge of bi. When a scheme is resistant to this attack 
in all three gradations we say that it provides biometric 
reference privacy. 

• Learn sample b'i. The same gradations apply as in 
the previous attack goal. We call the security property 
associated with this attack goal biometric sample privacy. 

• Trace users with different identities. This attack can 
be achieved when different references from the same 
user, possibly coming from different applications, can be 
linked. A system that is resistant to such an attack is said to 
provide identity privacy [26]. 

• Trace users over different queries. This attack refers 
to linking queries, whether anonymized or not, based on 
i, bi or b'i. The property of a system that prevents such an 
attack is called transaction anonymity [26]. Note that an 
attacker that is able to learn b'i can automatically trace 
users based on the learned sample. 

TABLE I 

Relevance of attack goals for different (malicious) entities in 
the system model (? = only relevant if the scheme under 
consideration was designed to hide references from VB; * = 
only relevant if the protocol operates in identification mode 
or if IDi and i are hidden from AS in verification mode). 

Attack goal                           AS    VB    M 
Learn bi                              ✓     ?     ✓ 
Learn b'i                             ✓     ✓     ✓ 
Trace Ui with different identities    ✓     ?     ✓ 
Trace Ui over different queries       ✓*    ✓     ✓ 

The formulated attack goals may apply to the different 
internal attackers as indicated by the different attack points. 
The relevance of the attack goals is shown in TABLE I. Attack 
goals can be generalized for combinations of inside attackers, 
e.g., AS and M. They are relevant for the combination if they 
are relevant for each attacker individually. As a counterexample, 
learning bi is not always relevant for the combination AS-VB. 
In some schemes it is assumed that VB stores references 
in the clear so the attack "learn bi" becomes trivial. It is 
important, however, that such schemes explicitly mention the 
assumption that VB is fully trusted. It will become clear in 
the following sections that the main focus of our work is on AS, 
who is a powerful attacker. This way of thinking is rather new 
and many protocols are not designed to be resistant to such an 
attacker. 

For each attacker or combination of attackers, and for each 
relevant attack goal a security requirement can be defined, 
namely that the average success probability of the given 
attacker that mounts the given attack on the scheme should 
be negligible in terms of some security parameter defined 
by the application. When analyzing the security of biometric 
authentication protocols that include distributed entities, each 
of these requirements should be checked individually. 

C. Requirements on Data Flows 

Coming back to the functionalities in our system model (cf. 
Section II-A), we use the attack goals defined in TABLE I to 
impose requirements on the data that are being exchanged. 

• AS should not be able to learn b'i, hence f1 is at least 
one-way, meaning that b'i should be unrecoverable from 
f1(b'i) with overwhelming probability. To prevent tracing 
Ui over different queries, e.g., in identification mode, it 
could also be required that f1 is semantically secure. We 
note that semantic security is a security notion that might 
be too strong but it ensures that the function prevents the 
minimum leakage as described under attack goal learn bi 
(Section II-B). 

• AS should not learn bi, hence f2 is at least one-way. To 
prevent tracing users with different identities it may be 
required that f2 is also semantically secure. 

• If applicable, AS should not be able to trace Ui by linking 
queries on IDi or i, and thus g1 should be semantically 
secure. 

• If applicable, VB may not learn bi, hence the bi would 
need to be stored in protected form using some semantically 
secure function. 

• VB may not learn b'i, hence g2 is one-way on its first 
input. It should also be semantically secure to prevent 
tracing Ui. 

• VB may not be able to link the queries at all, hence g2 
should also be semantically secure on its second input. 

• M may not learn the individual bi or b'i and must not 
be able to link references or queries from the same Ui, 
hence f3 should be semantically secure on tuples (b'i, bi). 

Now as we demand that M returns a result to AS that 
is a function (f4) of the distance d(b'i, bi) while maintaining 
the confidentiality and the privacy of the data, this means that 
some operations must be malleable. Malleability refers to the 
property of some cryptosystems that an attacker can modify a 
ciphertext into another valid ciphertext that is the encryption of 
some function of the original message, but without the attacker 
knowing this message. Depending on the exact step when the 
combination of bi and b'i is realized, either g2, f2 or f3 would 
be malleable. In the following section, we will show the impact 
of this fundamental limitation and how it can be exploited to 
attack existing protocols. 

III. Application to Existing Constructions 

In this section, we first extend the attacks that have been 
introduced by Bringer et al. in [25] in the context of hardware 
security to more complex cryptographic protocols that use 
homomorphic encryption: in Section III-A for a scheme by 
Bringer et al. [19] and in Section III-B for a scheme by 
Barbosa et al. [ ]. We then describe another kind of attack 
by looking at a scheme by Stoianov [22] in Section III-C. 
Finally, we briefly discuss attacks on two other schemes [23], 
[24] in Section III-D. All schemes are described with the goal 
to fit them directly into our model. 

A. Bringer et al. ACISP 2007 

1) Description: In [19], Bringer et al. presented a new 
security model for biometric authentication protocols that 
separates the tasks of comparing, storing and authorizing an 
authentication request amongst different entities: a fully trusted 
sensor S, an authentication server AS, a database VB and 
a matching service M. The goal was to prevent any of the 
latter three from learning the relation between some identity and the 
biometric features that relate to it. Their model forms the basis 
of our current framework and in this model they presented a 
scheme that applies the Goldwasser-Micali cryptosystem [28]. 
Let $\mathcal{E}_{GM}$ and $\mathcal{D}_{GM}$ denote encryption and decryption, respectively, 
and note that for any $m, m' \in \{0, 1\}$ we have the homomorphic 
property $\mathcal{D}_{GM}(\mathcal{E}_{GM}(m, pk) \times \mathcal{E}_{GM}(m', pk), sk) = m \oplus m'$. 
The scheme in [ ] goes as follows. 
During the enrollment phase, the user Ui registers at the 
authentication server AS. He then gets an index i and a 
pseudonym IDi. Let N denote the total number of records 
in the system. Database VB receives and stores (bi, i) where 
bi stands for Ui's biometric template, a binary vector of 
dimension M, i.e., $b_i = (b_{i,1}, b_{i,2}, \ldots, b_{i,M})$. In the following, 
we suppose that i is also the index of the record bi in the 
database VB. 

A key pair is generated for the system. Matcher M possesses 
the secret key sk. The public key pk is known by S, 
AS and VB. The authentication server AS stores a table of 
relations (IDi, i) for $i \in \{1, \ldots, N\}$. Database VB contains 
the enrolled biometric data $b_1, \ldots, b_N$. 

When user Ui wants to authenticate himself, the sensor S 
sends an encrypted sample $\mathcal{E}_{GM}(b'_i, pk)$ and IDi to AS. The 
authentication server AS requests the encrypted reference 
$\mathcal{E}_{GM}(b_i, pk)$ from VB and combines it with the encrypted 
sample. Because of the homomorphic property, AS is able 
to obtain $\mathcal{E}_{GM}(b'_i \oplus b_i, pk)$. Note that the encryption is bitwise, 
so AS will permute the M encryptions and forward these to 
M. Because M has the secret key sk, M can decrypt the 
permuted XOR-ed bits and compute the Hamming distance 
between the sample and the reference. 
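
The honest protocol run described above can be sketched in a few lines of Python. This is a toy illustration only: the primes are far too small to be secure, all function names are hypothetical, and choosing both primes congruent to 3 mod 4 so that -1 can serve as the public non-residue is a convenience of the example rather than something prescribed by [19] or [28].

```python
import math
import random
import secrets

# Toy Goldwasser-Micali parameters (assumption: insecure sizes, demo only).
P, Q = 499, 547          # secret key of the matcher, both primes are 3 mod 4
N = P * Q                # public modulus
X = N - 1                # -1 mod N: a non-residue modulo both P and Q


def gm_encrypt(bit):
    """Encrypt one bit as y^2 * X^bit mod N with fresh randomness y."""
    while True:
        y = secrets.randbelow(N - 2) + 2
        if math.gcd(y, N) == 1:
            return (y * y * pow(X, bit, N)) % N


def gm_decrypt(c):
    """Bit is 0 exactly when c is a quadratic residue mod P (Euler's criterion)."""
    return 0 if pow(c, (P - 1) // 2, P) == 1 else 1


def encrypt_vector(bits):
    return [gm_encrypt(b) for b in bits]


def as_combine(enc_sample, enc_reference):
    """Role of AS: multiply ciphertexts bitwise (XOR of plaintexts), then permute."""
    combined = [(a * b) % N for a, b in zip(enc_sample, enc_reference)]
    random.SystemRandom().shuffle(combined)
    return combined


def matcher_distance(ciphertexts):
    """Role of M: decrypt the permuted XOR bits and count the ones."""
    return sum(gm_decrypt(c) for c in ciphertexts)
```

Note that `as_combine` never needs the secret key: the multiplication of ciphertexts is exactly the malleability that the attacks in this section rely on.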

The security of this protocol is proved in [ ] under the 
assumption that all the entities in the system will not collude 
and are honest-but-curious. It is this assumption that we 
challenge in our framework, which leads to the following 
attack. 

2) Authentication Server Adversary (A=AS): The following 
attack shows how a malicious authentication server AS 
can learn the enrolled biometric template bi corresponding to 
some identity IDi. To do so the authentication server AS requests 
the template bi without revealing IDi and receives from 
VB the encrypted template that was stored during enrollment, 
i.e., $\mathcal{E}_{GM}(b_i, pk) = (\mathcal{E}_{GM}(b_{i,1}, pk), \ldots, \mathcal{E}_{GM}(b_{i,M}, pk))$. 

The attack consists of a bitwise search performed by AS in 
the encrypted domain. First AS computes the encryption of a 
zero bit $\mathcal{E}_{GM}(0, pk)$. If the public key is not known by AS, he 
can take an encrypted bit of the template retrieved from VB 
and compute $\mathcal{E}_{GM}(b_{i,k}, pk)^2 = \mathcal{E}_{GM}(0, pk)$, since the square 
encrypts $b_{i,k} \oplus b_{i,k} = 0$. Let the maximum 
allowed Hamming distance be t. 

Now AS will take the first encrypted bit $\mathcal{E}_{GM}(b_{i,1}, pk)$, 
repeat it t+1 times and add M-t-1 encryptions of a zero bit. 
Note that the ciphertext $\mathcal{E}_{GM}(b_{i,1}, pk)$ can be re-randomized 
so that it is impossible to detect that the duplicate ciphertexts 
are "copies". If $b_{i,1}$ is one, the total Hamming distance as 
computed by M will be t+1 and M will return NOK (not 
ok). If $b_{i,1}$ is zero, M will return OK. This process can 
be repeated for all bits of bi; hence, AS can learn bi bit by 
bit in M queries. To further disguise the attack AS can apply 
permutations and add up to t encryptions of one-bits to make 
the query look genuine. 
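
The bitwise search can be simulated end to end. The sketch below is a toy model under the same assumptions as before (tiny insecure primes, -1 as public non-residue, hypothetical function names); the matcher is modeled as an honest oracle that only answers OK or NOK.

```python
import math
import secrets

P, Q = 499, 547          # toy secret key held by the matcher (insecure sizes)
N = P * Q
X = N - 1                # public non-residue (valid because P, Q are 3 mod 4)


def enc(bit):
    while True:
        y = secrets.randbelow(N - 2) + 2
        if math.gcd(y, N) == 1:
            return (y * y * pow(X, bit, N)) % N


def dec(c):
    return 0 if pow(c, (P - 1) // 2, P) == 1 else 1


def matcher_ok(ciphertexts, t):
    """Honest matcher: OK (True) iff the decrypted weight is at most t."""
    return sum(dec(c) for c in ciphertexts) <= t


def rerandomize(c):
    """Multiply by a fresh encryption of zero; the plaintext bit is unchanged."""
    return (c * enc(0)) % N


def as_attack(enc_reference, t):
    """Malicious AS: recover the reference bit by bit, one matcher query per bit."""
    m = len(enc_reference)
    recovered = []
    for c in enc_reference:
        # t+1 disguised copies of the target bit, padded with zero bits
        query = [rerandomize(c) for _ in range(t + 1)]
        query += [enc(0) for _ in range(m - t - 1)]
        secrets.SystemRandom().shuffle(query)
        # NOK means the weight reached t+1, i.e., the target bit is one
        recovered.append(0 if matcher_ok(query, t) else 1)
    return recovered
```

The attacker only ever handles ciphertexts and the OK/NOK answer, which matches the blackbox setting of the framework.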

3) Matcher and Sensor Adversary (A=M+S): A bitwise 
search attack similar to the previous attack can also be 
considered in the case of an adversary made of the matcher 
assisted by the sensor. The attack consists of the following 
steps: 

• S sends the encryption of b'i = (0, ..., 0); 

• M receives bi ⊕ b'i bitwise but permuted and records the 
weight of bi ⊕ b'i; 

• S toggles a bit in the vector b'i in position x and sends it 
to AS; 

• M observes the changed weight (+1 or -1) and learns the 
bit at position x in bi. 

The adversary learns bi in M queries. 
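
The steps of this attack can be sketched in Python as follows. The encryption layer is omitted because M decrypts the XOR-ed bits anyway; the closure `make_system` is a hypothetical stand-in for the honest AS and VB, and deriving the last bit from the baseline weight (so that exactly M queries are used) is a choice made for this example.

```python
def make_system(reference):
    """Honest AS + VB: for a chosen sample, M ends up seeing weight(bi XOR b'i)."""
    def weight_seen_by_matcher(sample):
        return sum(r ^ s for r, s in zip(reference, sample))
    return weight_seen_by_matcher


def matcher_sensor_attack(weight_seen_by_matcher, m):
    """Colluding M and S: recover the m-bit reference and count the queries."""
    base = weight_seen_by_matcher([0] * m)      # query 1: weight of bi itself
    queries = 1
    bits = []
    for x in range(m - 1):
        probe = [0] * m
        probe[x] = 1                            # S toggles position x
        w = weight_seen_by_matcher(probe)
        queries += 1
        bits.append(1 if w == base - 1 else 0)  # weight drops iff bi[x] == 1
    bits.append(base - sum(bits))               # last bit follows from the total
    return bits, queries
```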

4) Discussion: What makes the first attack (A=AS) feasible 
is that all bits are encrypted separately and that the 
cryptosystem is homomorphic and thus f1(b'i) and f2(bi) are 
malleable (needed to create the encryption of a zero-bit if the 
public key is not known). Moreover, it is not enforced that AS 
combines the input from the sensor and from the database. 

To counteract this threat, one could require S to sign the 
input and force VB to merge the input with references. In 
this way VB combines the sample and the reference, and AS 
does not receive the reference $\mathcal{E}_{GM}(b_i, pk)$ but the combination 
of sample and reference, $\mathcal{E}_{GM}(b'_i \oplus b_i, pk)$. Using the previous attack, 
however, AS can still learn b'i and b'i ⊕ bi. Additional 
measures have to be taken to prevent this, e.g., VB could be 
required to sign $\mathcal{E}_{GM}(b'_i \oplus b_i, pk)$, which will be verified by 
M. Note that in the case where AS and VB collude, these 
countermeasures are not sufficient anymore. 



B. Barbosa et al. ACISP 2008 

1) Description: In [20] Barbosa et al. presented a new protocol 
for biometric authentication, following [ ] (see previous 
Section III-A). A notable difference between these two comes 
from the fact that [ ] compares two biometric templates by 
their Hamming distance, enabling biometric authentication, 
whereas [ ] classifies one biometric template into different 
classes thanks to the SVM classifier (support vector machine, 
see [ ] for details) leading to biometric identification. Biometric 
templates are represented as feature vectors where 
each feature is an integer, i.e., $b_i = (b_{i,1}, \ldots, b_{i,k}) \in \mathbb{N}^k$. 
Barbosa et al. encrypt this vector, feature by feature, with 
the Paillier cryptosystem [30]. In particular, they exploit its 
homomorphic property to compute the SVM classifier (think 
of a sum of scalar products) in the encrypted domain. 

However, as we explain further below in this section, since 
the features are encrypted one by one, an adversary can mount 
an attack similar to the one described in the previous 
section (Section III-A). 

Let $\mathcal{E}_{\mathrm{Paillier}}$ (resp. $\mathcal{D}_{\mathrm{Paillier}}$) denote the encryption (resp. 
decryption) with Paillier's cryptosystem. This cryptosystem 
enjoys a homomorphic property which ensures that the product 
of two ciphertexts corresponds to the encryption of the sum of 
their plaintexts: for $m_1, m_2 \in \mathbb{Z}_n$ we have that 
$\mathcal{D}_{\mathrm{Paillier}}(\mathcal{E}_{\mathrm{Paillier}}(m_1) \times \mathcal{E}_{\mathrm{Paillier}}(m_2)) = m_1 + m_2 \bmod n$. Note that $\mathbb{Z}_n$ is the plaintext 
space of the Paillier cryptosystem. 
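
The additive property can be checked with a toy Paillier implementation. The sketch below assumes the common simplification g = n + 1 and uses tiny, insecure primes; the function names are hypothetical and the code is only meant to make the homomorphism concrete (it requires Python 3.8+ for the modular inverse via `pow`).

```python
import math
import secrets

# Toy Paillier parameters (assumption: insecure sizes, generator g = n + 1).
P, Q = 499, 547
N = P * Q
N2 = N * N
LAM = (P - 1) * (Q - 1) // math.gcd(P - 1, Q - 1)   # lcm(P-1, Q-1), secret
MU = pow(LAM, -1, N)                                # inverse of LAM mod N


def encrypt(m):
    """E(m) = (1 + m*N) * r^N mod N^2 for a random r coprime to N."""
    while True:
        r = secrets.randbelow(N - 1) + 1
        if math.gcd(r, N) == 1:
            break
    return ((1 + m * N) * pow(r, N, N2)) % N2


def decrypt(c):
    """D(c) = L(c^LAM mod N^2) * MU mod N with L(u) = (u - 1) / N."""
    u = pow(c, LAM, N2)
    return ((u - 1) // N * MU) % N


def add_encrypted(c1, c2):
    """Product of ciphertexts encrypts the sum of the plaintexts (mod N)."""
    return (c1 * c2) % N2


def scale_encrypted(c, k):
    """Raising a ciphertext to the power k encrypts k times the plaintext."""
    return pow(c, k, N2)
```

`scale_encrypted` is the operation that lets the database evaluate scalar products on encrypted features, which is how the classifier is computed in the encrypted domain.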

The SVM classifier takes as input U classes (or users) 
and S samples per class, and determines support vectors 
$SV_{i,j}$ and weights $\alpha_{i,j}$ for $1 \le i \le S$ and $1 \le j \le U$. 
Following the notation in [20], let $v = (v_1, \ldots, v_k) = b'_i$ 
denote a freshly captured biometric sample. For this sample 
the classifier computes 

$$cl^{(j)}_{SVM}(v) = \sum_{i=1}^{S} \alpha_{i,j} \sum_{l=1}^{k} v_l\,(SV_{i,j})_l, \quad \text{for } j = 1, \ldots, U. \tag{7}$$

With this vector $cl_{SVM}(v)$, it is possible to determine which 
class is the most likely for v or to reject it. The support vectors 
$SV_{i,j}$ and the weight coefficients $\alpha_{i,j}$ are the references that 
are stored by VB. 

Briefly, the scheme of Barbosa et al. works as follows: 

1) The sensor S captures a fresh biometric sample and 
encrypts each of the features of its template $v = (v_1, \ldots, v_k)$ 
with Paillier's cryptosystem and sends 
it to the authentication server AS. Let 
$auth = (\mathcal{E}_{\mathrm{Paillier}}(v_1), \ldots, \mathcal{E}_{\mathrm{Paillier}}(v_k))$. 

2) The database VB computes an encrypted version of 
the SVM classifier for this biometric data: 
$c_j = \prod_{i=1}^{S} \left( \prod_{l=1}^{k} [auth]_l^{[SV_{i,j}]_l} \right)^{\alpha_{i,j}}$, where $[\cdot]_l$ denotes the 
l-th component of a tuple. This $c_j$ corresponds to the 
encryption of $cl^{(j)}_{SVM}(v)$ with Paillier's cryptosystem as 
defined above. The database returns the values $c_j$ to AS. 

3) The authentication server AS scrambles the values $c_j$ 
and forwards them to M.¹ 

4) The matcher M, using the private key of the system, 
decrypts the components of the SVM classifier and performs 
the classification of v. The classification returns 
the class for which the value $cl^{(j)}_{SVM}(v)$ is maximal. 

5) Based on the output of M, AS determines the real 
identity of Ui (in case of non-rejection). 

2) Authentication Server Adversary (A=AS): The following 
attack shows how a malicious AS can recover the 
biometric references. In this scheme, the biometric reference 
data that are stored by VB, i.e., the support vectors $SV_{i,j}$ 
and the weight coefficients $\alpha_{i,j}$, represent hyperplanes that are 
used for classification. These k-dimensional hyperplanes are 
expressed as linear combinations of enrollment samples (the 
support vectors). We will show how these hyperplanes can be 
recovered dimension by dimension. 

Let us rewrite (7) as 

$$cl^{(j)}_{SVM}(v) = v_1 \sum_{i=1}^{S} \alpha_{i,j}(SV_{i,j})_1 + \cdots + v_k \sum_{i=1}^{S} \alpha_{i,j}(SV_{i,j})_k = v_1 \beta_{j,1} + \cdots + v_k \beta_{j,k}.$$

By sending a vector v = (1, 0, ..., 0) to VB, AS will retrieve 
the encryption of $\beta_{j,1} = \sum_{i=1}^{S} \alpha_{i,j}(SV_{i,j})_1$ for each user, 
indexed by j, in the database. 
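
The linearity that this step exploits can be checked on plaintext values. In the sketch below, `svm_scores` evaluates (7) directly and `probe_hyperplanes` queries it with unit vectors; both names and the nested-list data layout are assumptions of the example. In the actual scheme AS would only obtain Paillier encryptions of these values, not the values themselves.

```python
def svm_scores(v, support_vectors, alphas):
    """Evaluate (7): score of class j is sum_i alpha[j][i] * <v, SV[j][i]>."""
    return [
        sum(a * sum(x * s for x, s in zip(v, sv))
            for a, sv in zip(alphas[j], support_vectors[j]))
        for j in range(len(alphas))
    ]


def probe_hyperplanes(score_fn, k):
    """Query with each unit vector e_l; the answers are the beta[j][l] values."""
    columns = []
    for l in range(k):
        e = [0] * k
        e[l] = 1
        columns.append(score_fn(e))      # columns[l][j] == beta[j][l]
    # transpose so that the result is indexed as betas[j][l]
    return [list(row) for row in zip(*columns)]
```

With the recovered coefficients, the score of any sample v for class j is simply the scalar product of v with `betas[j]`.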

Instead of sending all $c_j = \mathcal{E}_{\mathrm{Paillier}}(\beta_{j,1})$ to M, only one 
value will be kept by AS, e.g., $c_1 = \mathcal{E}_{\mathrm{Paillier}}(\beta_{1,1})$. The 
authentication server will set $c_2 = \mathcal{E}_{\mathrm{Paillier}}(x)$ for some value 
$x \in \mathbb{Z}_n$ and all other $c_j = \mathcal{E}_{\mathrm{Paillier}}(0)$. The matcher M will 
return the index of the class with the greatest value, which is 
1 if $\beta_{1,1} > x$ and 2 if $\beta_{1,1} < x$. 

The initial value of x is n/2. If $\beta_{1,1} > x$ then AS will 
adjust x to n/2 + n/4, otherwise to n/2 - n/4. By repeating 
this process and adjusting the value x, the exact value can 
be learned after $\log_2 n$ queries. Hence, the reference data of a 
single user can be learned in $k \log_2 n$ queries to the matcher. 

¹ In [20], the entity that makes the decision is referred to as the verification 
server. To be consistent with our model we continue to use the term matcher. 
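
The binary search on x can be sketched as follows. The model is deliberately abstract: plaintext integers stand in for the Paillier ciphertexts that AS would actually send, class indices are 0-based, the hidden value is assumed to lie in [0, n), and ties are assumed to be resolved in favor of the lowest index (the description above leaves the case of equality open). All names are hypothetical.

```python
def make_matcher(num_classes):
    """M 'decrypts' the scores and returns the index of the largest one.

    Assumption: ties go to the lowest index.
    """
    def classify(scores):
        assert len(scores) == num_classes
        return max(range(num_classes), key=lambda j: scores[j])
    return classify


def make_as_query(hidden_beta, classify, num_classes):
    """Crafted query: slot 0 keeps the unknown value, slot 1 holds the guess x."""
    def beta_at_least(x):
        scores = [hidden_beta, x] + [0] * (num_classes - 2)
        return classify(scores) == 0
    return beta_at_least


def recover_value(beta_at_least, n):
    """Binary search for the hidden value in [0, n) from argmax answers only."""
    lo, hi, queries = 0, n - 1, 0
    while lo < hi:
        x = (lo + hi + 1) // 2       # x >= 1, so the zero slots never win
        queries += 1
        if beta_at_least(x):
            lo = x
        else:
            hi = x - 1
    return lo, queries
```

Each loop iteration halves the candidate interval, which gives the logarithmic query count stated above; repeating the search for each of the k dimensions recovers a full hyperplane.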

with the permutation). Quite logical, as the matcher is 
determining a list of candidates. In particular, although the 
identifiers are permuted, he can detect if related inputs are 
used, to trace the user whole database (with a known input) 

3) Discussion: As in Section III-A this attack succeeds 
because features are encrypted separately and there is no check 
to see if the sample and the reference were really merged. The 
same attack can in principle be used to learn any information 
about the input sample. 

C. Stoianov SPIE 2010 

1) Description: In [22], Stoianov introduces several
authentication schemes relying on the Blum-Blum-Shub
pseudo-random generator. We focus on the database setting from the
paper (cf. Section 7 of [22]). In this setting there is a service
provider SP that performs the verification. Consistent with our
model, we will call this entity the matcher M. Sample and
reference are combined before being sent to M and, although
this is not explicitly mentioned in [ ], we assign this
functionality to the authentication server AS in our model.

In the schemes of [22], the biometric data b are binarized
and are combined with a random codeword c coming from
an error-correcting code to form a secure sketch or code
offset b ⊕ c where ⊕ stands for the exclusive-or (XOR).
When a new capture b' is made, whenever b' is close to b
(using the Hamming distance) it is possible to recover c from
b ⊕ b' ⊕ c using error correction. This technique is known as
the fuzzy commitment scheme of Juels and Wattenberg [5].
An additional layer of protection is added by encrypting the
secure sketch using Blum-Goldwasser.
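The code-offset construction can be illustrated with a toy example. The sketch below assumes a 5-fold repetition code and SHA-256 as the hash; both are choices made here for illustration and are not taken from [22] or [5], and the function names are hypothetical.

```python
import hashlib
import secrets

R = 5  # repetition factor of the toy code (corrects up to 2 errors per block)


def encode(msg_bits):
    """Repeat every message bit R times."""
    return [bit for bit in msg_bits for _ in range(R)]


def decode(word):
    """Majority vote over each block of R bits."""
    return [int(sum(word[i:i + R]) > R // 2) for i in range(0, len(word), R)]


def xor(a, b):
    return [x ^ y for x, y in zip(a, b)]


def commit(b):
    """Return the sketch b XOR c and a hash of the random codeword c."""
    msg = [secrets.randbelow(2) for _ in range(len(b) // R)]
    c = encode(msg)
    return xor(b, c), hashlib.sha256(bytes(c)).hexdigest()


def verify(sketch, h, b_prime):
    """Accept b' iff decoding (b XOR c) XOR b' yields the committed codeword."""
    noisy = xor(sketch, b_prime)          # equals c XOR (b XOR b')
    c_hat = encode(decode(noisy))
    return hashlib.sha256(bytes(c_hat)).hexdigest() == h
```

A fresh capture that differs from b in at most two positions per block is accepted; three flips inside one block change the decoded codeword and the hash check fails.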

The Blum-Blum-Shub pseudo-random generator [ ] is a
tool used in the Blum-Goldwasser asymmetric cryptosystem
[32]. From a seed x_0 and a public key, a pseudo-random
sequence S is generated. In the following, S is XOR-ed to
the biometric data to be encrypted. By doing so, the state of
the pseudo-random generator is updated to x_{t+1}. From x_{t+1}
and the private key, the sequence S can be recomputed.
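The way S is recomputed from x_{t+1} can be sketched as follows. This is a minimal illustration, assuming a modulus n = pq with both primes congruent to 3 mod 4, a seed that is a quadratic residue, and one keystream bit (the least significant bit) per squaring; real deployments use large primes and may extract more bits per step. The function names are hypothetical.

```python
def bbs_stream(x0, n, t):
    """Return (bits, x_next): t keystream bits and the final state x_{t+1}."""
    x, bits = x0, []
    for _ in range(t):
        x = pow(x, 2, n)           # x_i = x_{i-1}^2 mod n
        bits.append(x & 1)         # keep the least significant bit
    return bits, pow(x, 2, n)


def recover_x0(x_next, p, q, t):
    """Undo the t+1 squarings using the private factors p and q."""
    # On quadratic residues mod p, raising to (p+1)/4 inverts squaring.
    dp = pow((p + 1) // 4, t + 1, p - 1)
    dq = pow((q + 1) // 4, t + 1, q - 1)
    u = pow(x_next, dp, p)         # x_0 mod p
    v = pow(x_next, dq, q)         # x_0 mod q
    # Chinese remainder theorem: combine the two residues.
    return (v + q * ((u - v) * pow(q, -1, p) % p)) % (p * q)
```

With the toy primes p = 499 and q = 547, calling `bbs_stream(recover_x0(x_next, p, q, t), p * q, t)` reproduces the keystream that the holder of the public key generated.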

In this system of Stoianov, M generates the keys and sends
the public key to S. On enrollment:

1) Sensor S computes (S ⊕ b ⊕ c, x_{t+1}) where:

• Sample b is the freshly captured biometric data,

• String S is a pseudo-random sequence and x_{t+1} is
the state of the Blum-Blum-Shub pseudo-random
generator as described above, and

• c is a random codeword which makes the secure
sketch c ⊕ b;

2) Sensor S sends S ⊕ b ⊕ c to VB;

3) Sensor S sends x_{t+1} and H(c) to M where H is a
cryptographic hash function.

Using the private key, M computes S from x_{t+1} and stores
it along with H(c). Periodically, M (resp. VB) updates S (resp.
S ⊕ b ⊕ c) to S̃ (resp. S̃ ⊕ c ⊕ b) with an independent stream
cipher.

During authentication, sensor S receives a new sample b' and
forwards (S' ⊕ b', x'_{t+1}) to AS, where S' is a new
pseudo-random sequence. It is assumed that there is some sort of
authentication server AS that retrieves S ⊕ c ⊕ b from VB and
merges it with S' ⊕ b'. Finally S' ⊕ b' ⊕ S ⊕ b ⊕ c and x'_{t+1}
are sent to M. Using the private key, M recovers S'. From S'
and S, M computes c ⊕ b ⊕ b', tries to decode it and verifies
the consistency of the result with H(c).

2) Matcher Adversary (A=M): Let M be the primary
attacker. It is inherent to the scheme that M can always trace a
valid user over different queries by looking at the codeword c,
which is revealed after a successful authentication. Depending
on the entity that colludes with M, additional attacks can be
devised.

If M and VB collude (A=M+VB) they learn the sketch
c ⊕ b. This implies that they can immediately trace users with
different identities following the linkability attack based on
the decoding of the sum of two sketches as described in [11].
From a genuine match, M learns c and thus also b.
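The linkability argument can be written in one line, assuming the error-correcting code is linear, so that the XOR of two codewords is again a codeword. The indices 1 and 2 denote two enrollments under different identities and are notation introduced here for illustration.

```latex
\[
(c_1 \oplus b_1) \oplus (c_2 \oplus b_2)
  = \underbrace{(c_1 \oplus c_2)}_{\text{a codeword}}
    \oplus \underbrace{(b_1 \oplus b_2)}_{\text{low weight if same user}}
\]
```

If both sketches come from the same user, the sum is a codeword plus a low-weight error and decodes successfully; for unrelated users decoding is expected to fail.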

If M and S collude (A=M+S) they control and always
learn the input sample b'. By setting b' = 0 they learn c ⊕ b
from a single query. If a successful authentication occurred,
the adversary learns everything.

If M and AS collude (A=M+AS) they always learn the
input sample b'. They can learn the sketch c ⊕ b for any
reference and thus trace users with different identities as in the
case (A=M+VB). They learn the reference b after successful
authentication.

3) Authentication Server Adversary (A=AS): In the current
scheme, the bits are not encrypted independently, one by one.
Moreover, they are masked with streams generated via
Blum-Blum-Shub and a codeword, so attacks as in Sections III-A
and III-B are no longer possible. Nevertheless, there is still a
binary structure that AS may exploit.

Assume that AS knows S' ⊕ b' that leads to a positive
decision, i.e., M accepts b' because d(b, b') < t. Then AS
can start from S' ⊕ b' and progressively add errors until
he reaches a negative result. Then, he backtracks one step by
decreasing the error weight by one to come back to the last
positive result. This gives AS an encrypted template S' ⊕ b''.
Consider now the vector S' ⊕ b'' ⊕ S ⊕ c ⊕ b and replace the
first bits (say of small length l) by an l-bit vector x.

• For all possible values of x, AS sends the resulting vector
(the first block is changed by the value x) to M, who acts
as a decision oracle.

• If several values give a positive result, then AS increases
the errors on all but the first block.

• This is repeated until only one value of x gives a positive
result.

• When this step is reached, AS has found the value x with
no errors, i.e., he learns the first block of S' ⊕ S ⊕ c.

• AS proceeds to the next block.

Following this strategy, it is feasible to recover all the bits of
b ⊕ b'. If AS colludes with S, he can retrieve the full reference
template b as soon as S knows one sample that is close to b.
We call this attack a center search (cf. Section IV below).
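A simplified variant of this block-wise search can be sketched as follows. The sketch rests on assumptions that are not part of [22]: the matcher is modeled as a plain oracle `accept(w)` that returns true iff w differs from the hidden mask S' ⊕ S ⊕ c in at most t positions, the separate backtracking step is folded into the main loop, and the vector length minus the block length is assumed to exceed 2t. All function names are hypothetical.

```python
from itertools import product


def recover_block(accept, w, start, l):
    """Recover the l-bit block of the hidden mask that begins at `start`."""
    w = list(w)
    outside = [i for i in range(len(w)) if not (start <= i < start + l)]
    for i in [None] + outside:
        if i is not None:
            w[i] ^= 1              # changes the error weight outside the block by one
        hits = []
        for x in product((0, 1), repeat=l):
            trial = w[:start] + list(x) + w[start + l:]
            if accept(trial):      # M acts as a decision oracle
                hits.append(list(x))
        if len(hits) == 1:         # only the error-free block is still accepted
            return hits[0]
    raise RuntimeError("no unique block found")


def recover_mask(accept, w, l):
    """Apply recover_block to every block; len(w) must be a multiple of l."""
    mask = []
    for start in range(0, len(w), l):
        mask += recover_block(accept, w, start, l)
    return mask
```

XOR-ing the recovered mask with the accepted vector that AS started from yields the difference between sample and reference. Each step costs 2^l oracle queries, so the approach is only practical for small l.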

4) Discussion: In a way similar to the inherent traceability
of users by M, there are no mechanisms described that protect
against the database tracing users over different queries, i.e.,
by tracking S ⊕ c ⊕ b lookups.

We note that the matcher M is very powerful because he
knows the secret key, which allows computing S' and S. As
soon as M colludes with one of the other entities he is able
to learn everything from a genuine match or a false accept.

D. Other Schemes 

Due to the generic design of our model, several other 
schemes in the literature fit our model. Nevertheless, as they 
are not always designed with the same entities, an adaptation 
might be required. Some others are not compatible at all; for 
instance those for which the security relies on a user-secret 
key stored on the user side. We now present a brief overview 
of the schemes [23], [24] when analyzed in our model. 

ACM MMSec 2010 eSketch: This scheme of Failla et
al. is described in [ ] following a client-server model. The
client corresponds to AS and the server can be logically
separated into VB and M. The goal of the scheme is to
provide anonymous identification. The VB stores data derived
from the biometric reference, in particular secure sketches, and
part of the data is encrypted via the Paillier cryptosystem with
the corresponding secret key owned by AS. The identification
query is implemented through different exchanges between the
entities and at one step the same randomness is used to mask
all the different reference templates and the masked values are
sent to AS. Consequently, an authentication server adversary
(A = AS) learns the whole database after one successful
authentication, because the client (AS in our model) knows the
Paillier secret keys. If the adversary consists of the database
and the matcher (A = VB + M), it is also possible to learn
the reference template, which is supposed to be hidden from the
server.

ACM MMSec 2010 Secure Multiparty Computation: This
scheme of Raimondo et al. [ ] is also described following
a client-server model with secure multiparty computation
between them to achieve an identification scenario
(authentication scenario as well, cf. [24, Fig. 3]). The scheme is
not made to be resistant against malicious adversaries. Fitting
it in our model, we have AS which obtains the result and
VB which stores all the references in clear; AS sends an
encrypted (via Paillier) query to VB; VB sends back to AS
all the entries combined with the query (this gives in fact a
database containing all the Euclidean distances) and then AS
and M interact (secure multiparty protocol) to output the list
of identifiers for which the distance is below a threshold. Here
again encryption of the query is made block by block, so a
similar strategy as in Section III-B is possible when A = AS.
An adversary A = VB + M can also tamper with the inputs to the
last part of the protocol to learn information about the query.

IV. Formalization of Attack Scenarios 

The goal of this section is to explore some generic attack
scenarios that can be used for analyzing actual protocol
specifications. These attacks are presented in the framework
as described in Section II and generalize the attacks of the
previous Section III. As explained in Section II we only
consider malicious internal attackers, i.e., AS, VB, M and
combinations of these entities. User Ui and the sensor S have
been excluded as individual attackers.

A. Blackbox Attack Model 

The different attacks that can be carried out by the attackers 
are modeled as blackbox attacks, following recent results 
from [25]. This allows us to clearly specify the focus of 
the attack. Our blackbox-attack model consists of two logical 
entities: 

1) The attacker, i.e., one or more system entities. These 
entities are fully under control of the attacker: internal 
data are known, messages can be modified and addi- 
tional transactions can be generated. 

2) The target or the blackbox, i.e., the combination of all 
other system entities. The attack is focused on the data 
that are protected by the system components within the 
blackbox. 

The target is modeled as a blackbox because the attacker 
can only observe the input-output behavior of the box. This 
adequately reflects remote protocols where only the communi- 
cation can be seen by the attacker. No details are known about 
the internal state of the remote components. During the attack, 
the attacker will "tweak" inputs to the blackbox. However, all 
communication must comply with the protocol specification. 
Any messages that are malformed or that are sent in the wrong 
order are rejected by the blackbox. 

It should be noted that there are cases in which the attacker
cannot generate additional transactions because he has to
follow the protocol specifications. E.g., if VB is attacking
he has to wait until a request is received from AS. When
analyzing protocols it should be assumed that this will occur
with a reasonable frequency. If relevant, attack complexity can
be expressed as a function of this frequency. Similarly, if the
attacker is AS, he receives inputs from S and communicates
with VB and M. In this case we exclude Ui and S from
the blackbox. It should be assumed, though, that a number of
inputs from S is available to AS. This does not necessarily
imply that S is under control of AS. The analysis of the attack
can take into account the amount of data that is available.

We will now consider a number of possible adversaries and 
blackbox attacks in our framework. 

B. Attacker = AS 

Decomposed Reference Attack: Let's assume that only
one reference $b_i$ is returned by VB. The goal of this attack
is to learn $b_i$. Biometric samples or references are often
represented as a "string", i.e., a concatenation (let || denote
concatenation) of (binary) symbols. Let's assume that $f_1(b'_i)$
is the concatenation of a subfunction $\tilde{f}_1$ that is applied on
each of the $n$ components $b'_{i,j}$ of $b'_i$ individually. If AS has to
combine $f_1(b'_i)$ and $f_2(b_i)$ without knowing either the sample
or the reference, it is likely that $f_2$ and $f_3$ will also be the
concatenation of component-wise applied subfunctions, i.e.,
$f_3(b_i, b'_i) = \tilde{f}_3(b_{i,1}, b'_{i,1}) \| \cdots \| \tilde{f}_3(b_{i,n}, b'_{i,n})$. Note that in our
model AS can generate the value $\tilde{f}_3(b_{i,j}, b'_{i,j})$ but this value
should not reveal to AS whether the inputs are the same or
not. This decomposition of references is used in the scheme
analyzed in Section III-A and the following attack applies to
it.

Suppose that AS is able to generate a value that is valid
output of $\tilde{f}_3$ when the two component inputs $b_{i,j}$ and $b'_{i,j}$
are the same and similarly when they are not the same, e.g.,
the output is the encryption of one or zero. If AS can also
compute $\tilde{f}_1$, then AS can fully reconstruct $b_i$. To do so AS
chooses the first component of $b'_i$ at random, combines it with
the first component of $b_i$ and sends the result to M. The other
components that are sent to M are such that $t$ of those are
an output of $\tilde{f}_3$ that reflects different inputs and the $n - t - 1$
remaining components are outputs that reflect equal inputs.
Note that $t$ is the comparison threshold. If the guess of AS
for the first component is correct then M will return a positive
match. Otherwise the guess is wrong and AS can try again.
This process can be repeated until all components of $b_i$ are
recovered. For binary samples, this requires $n$ queries to M
and 1 query to VB.
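For binary components the attack can be sketched as below. The model is a toy one and every name in it is hypothetical: `combine(j, guess)` stands for the opaque component-wise combination of the guess with the stored reference component (AS forwards it without reading it), forged "equal" and "different" outputs are represented by 0 and 1, and the matcher is assumed to accept iff at most t components indicate a difference.

```python
def make_system(reference, t):
    """Toy blackbox: the reference held by VB and the threshold decision of M."""
    def combine(j, guess):
        # Opaque to AS in a real scheme (e.g., a ciphertext); 1 means "different".
        return int(reference[j] != guess)

    def matcher(components):
        return sum(components) <= t

    return combine, matcher


def decomposed_reference_attack(combine, matcher, n, t):
    """Recover all n reference bits with one matcher query per bit."""
    recovered, queries = [], 0
    # t forged 'different' outputs and n - t - 1 forged 'equal' outputs
    forged = [1] * t + [0] * (n - t - 1)
    for j in range(n):
        guess = 0
        probe = forged[:j] + [combine(j, guess)] + forged[j:]
        queries += 1
        # Accepted: the guess added no error. Rejected: the bit is the other value.
        recovered.append(guess if matcher(probe) else 1 - guess)
    return recovered, queries
```

Because the forged components place the probe exactly at the threshold, the single unknown component decides the outcome, which is why n queries suffice for binary data.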

As shown in Section III-B a similar attack can be executed 
if the biometric data are represented as real-valued or integer- 
valued feature vectors. However, more queries might be re- 
quired to get an accurate result. 

Center Search Attack Using S: In this attack, S is also
compromised and under the control of the attacker. The attack
goal is to learn the full reference bi from a close sample. The
input sample is obviously always known to AS and S. Thus
at some point in time Ui will present a sample b'i that matches
reference bi. This sample will lie at some distance from the
reference. In the case where biometrics are represented as
binary strings and the system implements a Hamming distance
matcher, the attacker can recover the exact bi as follows.

The sensor flips the first bit of b'i and sends the new sample
to AS who performs the whole authentication procedure. If
the authentication succeeds S flips the second bit, leaving the
first bit also flipped, and sends the sample to AS who follows
the procedure again. This continues until the sample no longer
matches bi. Then the sensor starts again by restoring the first
bit of the sample that is no longer accepted and forwards it
to AS. If it gets accepted this means that the first bit of the
original sample b'i was the same as the first bit of bi. If not,
then the first bits were different. One by one the bits in b'i that
are different from those in bi can be corrected. This technique
was demonstrated in Section III-C.
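The procedure can be sketched as follows, assuming the whole authentication run is available as an oracle `accept(sample)` that returns true iff the Hamming distance to the hidden reference is at most t, and assuming t < n/2 so that the first phase always reaches a rejection. The function name and the oracle interface are hypothetical.

```python
def center_search(accept, sample):
    """Recover the hidden reference from one accepted binary sample."""
    n = len(sample)
    probe = list(sample)
    k = 0
    # Phase 1: flip bits cumulatively until the probe is just outside the sphere.
    while accept(probe):
        probe[k] ^= 1
        k += 1
    # The probe is now at distance exactly t + 1 from the reference.
    reference = []
    for i in range(n):
        probe[i] ^= 1              # toggle bit i
        closer = accept(probe)     # accepted iff the toggle moved towards the reference
        reference.append(probe[i] if closer else probe[i] ^ 1)
        probe[i] ^= 1              # return to the boundary state
    return reference
```

The second phase treats every position in the same way: toggling a bit of a probe that sits one step outside the acceptance sphere is accepted exactly when the toggled value agrees with the reference.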

We call this the center search attack because we start from
a sample that lies in a sphere with radius t, the matching
threshold, and the reference as center point. The goal of this
attack is to move the sample to the center of the sphere. The
worst-case complexity of this attack for bitstrings of length n
is the greatest of 2t + n and 4t. The complexity is 2t + n
if there are t - 1 bit-errors in the beginning and one at the end
of the string. The first t - 1 errors get corrected by flipping
them and t additional bits need to be flipped to invalidate the
sample. Locating the bit-errors requires searching till the end
of the string where the last error is. The complexity is 4t if
there are t - 1 correct bits followed by t wrong bits. So 2t
flips are needed before the queries no longer match and then
2t positions need to be searched. In practice, t < n/2 and
thus the worst-case complexity is 2t + n.

False-Acceptance-Rate Attack: A false acceptance occurs 
if a sample, not coming from Ui, is close enough to bi to 
be recognized by the system as a sample coming from Ui. 
The name comes from the fact that an attacker can take a 
large existing database of samples and feed that to a biometric 
authentication system. Due to the inherent false acceptance 
rates, there will be a sample in the attacker's database that 
matches the reference in the system with high probability. 

The goal of this attack is to learn $b_i$ from a matching sample
that is unknown to the attacker. This attack combines ideas
from the previous attacks. The attacker is AS, not including
S, and AS does not know how to compute an output of $\tilde{f}_3$
that reflects equal (or different) inputs. It is assumed, however,
that the attacker can replace the components of $b'_i$ in the value
he received from S, i.e., $f_1(b'_i)$. This is definitely the case if
$f_1$ is a concatenation of subfunctions and if AS can compute
such a subfunction $\tilde{f}_1$.

The actual attack then proceeds as follows. The attacker AS
waits until a genuine user presents a valid sample. The attack
is similar to the center-search attack, only now AS will
not flip bits but simply replace them with a known value, e.g.,
one. He will do this until the sample no longer matches. Then
AS already knows that the last bit he replaced was not one
and he will restore that bit. Then he continues to substitute the
bits one by one, carefully observing whether the sample matches
or not and learning all the bits. The first bits that were replaced to
invalidate the sample can be learned by simply restoring them.

C. Attacker = VB or M 

The attacker is the database VB or the matcher M who 
communicate with the authentication service AS only. The 
attackers cannot achieve any of the attack goals individually 
because their blackboxes give output, which cannot be influ- 
enced by the attacker, before receiving input. If these entities 
do not collude with other entities they are simply passive 
attackers and by Assumption 2 they cannot mount any attacks. 

Including the sensor S: If the sensor colludes with the 
database or the matcher, some attacks are trivial: the provided 
sample is known and thus it is also easy to trace a user based 
on the provided sample or identity. 

A powerful attacker is the combination of M and S,
as was shown in Section III-C. Because the sensor can send
any input and any identity, the attacker does not have to wait
for a matching sample. The same center-search attack can be
performed as in the case where AS and S are the attacker.

D. Attacker = AS and VB

The attackers (AS and VB) receive fresh input from S and 
communicate with the matcher. They can search the entire 
database and turn to identification although the protocol could 
be designed to operate in verification mode. 

The input sample b'i can be learned in the same way as
bi was learned in the attacks of AS, if the same conditions
hold. Then, depending on the implementation, the attacker can
learn the entire database because VB will return any bi and
AS will manipulate it until all bits are known.



E. Attacker = AS and M

The attack goal with the highest impact is to learn the
reference bi from the database. Depending on how M
implements its functionality this can be a very powerful
attacker, e.g., if M possesses decryption keys for encrypted
samples/templates as was the case in the schemes analyzed in
Sections III-A, III-B and III-C.

F. Attacker = VB and M 

In this combination of attackers, VB will manipulate its
output so that it can be of use to M. The relevant attack
goals are to learn b'i and to trace Ui.

G. Attacker = AS and VB and M

In this particular case, the attacker is a combination of AS,
VB and M, and the goal is to learn b'i. If the reference bi
is not stored in the clear by the database, the attacker may
want to learn bi also. Tracing Ui is almost trivial because the
attacker can perform a search (identification) on the database
VB. The attack goals are easily reached if the data can be
decomposed as explained in the attacks of AS.

V. Conclusion 

Biometric authentication protocols that are found in the 
literature are usually designed in the honest-but-curious model 
assuming that there are no malicious insider adversaries. In 
this paper, we have challenged that assumption and shown 
how some existing protocols are not secure against such 
adversaries. Such analysis is extremely relevant in the context 
of independent database providers. Much attention was given 
to an authentication server attacker, which is a central and 
powerful entity in our model. To prevent the attacks that 
were presented, stronger enforcement of the protocol design 
is needed: many attacks succeed because transactions can be 
duplicated or manipulated. 